Claim Amendments

1. (Cancelled). 

2. (Currently Amended) The method of claim 6 further comprising:

sorting each phrase of the set of frequently occurring phrases in inverse
lexicographical order prior to filtering the set of frequently occurring phrases.

3. (Currently Amended) The method of claim 6 wherein the text corpus is
preprocessed. 

4. (Original) The method of claim 3 wherein the text corpus is text of a human 
language. 

5. (Original) The method of claim 4 wherein the human language is Chinese. 

6. (Currently Amended) A method comprising:

creating a suffix tree to determine the frequency of phrases within a text corpus;

specifying a set of frequently occurring phrases; and

filtering the set of frequently occurring phrases to determine a set of frequently
occurring and unrecognized phrases as entity name and jargon term candidates,

wherein filtering the set of frequently occurring phrases includes comparing a
component word of a phrase to a dictionary of common words and excluding the 
phrase from the set of entity name and jargon term candidates if the component word 
is a common word. 
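
As an illustrative aside that is not part of the claim language, the steps recited in claim 6 can be sketched as follows. This is a minimal sketch under stated assumptions: it uses a depth-limited suffix trie over word tokens as an uncompressed stand-in for a suffix tree, and the names `build_suffix_trie`, `frequent_phrases`, and `filter_candidates`, the `min_count` threshold, and the `max_len` cap are all hypothetical, since the claims do not specify them.

```python
from typing import Dict, List, Set, Tuple

Phrase = Tuple[str, ...]


class Node:
    """One trie node; count is how often the phrase ending here occurs."""

    def __init__(self) -> None:
        self.count = 0
        self.children: Dict[str, "Node"] = {}


def build_suffix_trie(tokens: List[str], max_len: int = 4) -> Node:
    """Insert every suffix of the token list, truncated to max_len tokens."""
    root = Node()
    for start in range(len(tokens)):
        node = root
        for tok in tokens[start:start + max_len]:
            node = node.children.setdefault(tok, Node())
            node.count += 1
    return root


def frequent_phrases(root: Node, min_count: int = 2) -> Dict[Phrase, int]:
    """Collect phrases whose count reaches min_count (an assumed threshold)."""
    found: Dict[Phrase, int] = {}
    stack: List[Tuple[Phrase, Node]] = [((), root)]
    while stack:
        phrase, node = stack.pop()
        for tok, child in node.children.items():
            # A longer phrase can never be more frequent than its prefix,
            # so an infrequent child can be pruned with all its descendants.
            if child.count >= min_count:
                found[phrase + (tok,)] = child.count
                stack.append((phrase + (tok,), child))
    return found


def filter_candidates(phrases: Dict[Phrase, int],
                      common_words: Set[str]) -> Dict[Phrase, int]:
    """Exclude any phrase that has a component word in the dictionary."""
    return {p: c for p, c in phrases.items()
            if not any(word in common_words for word in p)}
```

For example, with the tokens of "the acme widget and the acme widget" and a dictionary containing "the" and "and", the surviving candidates are "acme", "widget", and "acme widget", each with a count of 2.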

7. (Original) The method of claim 4 further comprising: 

reducing the set of entity name and jargon term candidates by applying natural 
language processing rules. 

8. (Previously Presented) The method of claim 7 wherein the natural 
language processing rules are rules selected from the list consisting of morphological 
rules, semantic rules, and syntactic rules.

9. (Cancelled). 

10. (Currently Amended) The machine-readable medium of claim 29
wherein the method further comprises: 

sorting each phrase of the set of frequently occurring phrases in inverse 
lexicographical order prior to filtering the set of frequently occurring phrases. 

11. (Currently Amended) The machine-readable medium of claim 29
wherein the text corpus is preprocessed.

12. (Original) The machine-readable medium of claim 11 wherein the text
corpus is text of a human language. 

13. (Original) The machine-readable medium of claim 12 wherein the human 
language is Chinese. 

14. (Original) The machine-readable medium of claim 12 wherein filtering the 
set of frequently occurring phrases includes comparing a component word of a phrase 
to a dictionary of common words and excluding the phrase from the set of entity name 
and jargon term candidates if the component word is a common word. 

15. (Original) The machine-readable medium of claim 12 wherein the method 
further comprises: 

reducing the set of entity name and jargon term candidates by applying natural 
language processing rules. 

16. (Previously Presented) The machine-readable medium of claim 15 
wherein the natural language processing rules are rules selected from the list 
consisting of morphological rules, semantic rules, and syntactic rules. 

17. (Cancelled).

18. (Currently Amended) The system of claim 22 wherein the operations
further comprise: 

sorting each phrase of the set of frequently occurring phrases in inverse 
lexicographical order prior to filtering the set of frequently occurring phrases. 

19. (Currently Amended) The system of claim 22 wherein the text corpus is
preprocessed. 

20. (Original) The system of claim 19 wherein the text corpus is text of a 
human language. 

21. (Original) The system of claim 20 wherein the human language is 
Chinese. 

22. (Currently Amended) A system comprising:

a memory having stored therein executable instructions which when executed 
by a processor cause the processor to perform operations comprising: 

creating a suffix tree data structure, the suffix tree data structure storing 
phrase frequency data for a text corpus;

using the phrase frequency data to specify a set of frequently occurring
phrases; and

filtering the set of frequently occurring phrases to determine a set of
frequently occurring and unrecognized phrases as entity name and jargon
term candidates; and
a processor to execute the instructions, 

wherein filtering the set of frequently occurring phrases includes comparing a 
component word of a phrase to a dictionary of common words and excluding the 
phrase from the set of entity name and jargon term candidates if the component word 
is a common word. 

23. (Original) The system of claim 20 further comprising: 

reducing the set of entity name and jargon term candidates by applying natural 
language processing rules. 

24. (Previously Presented) The system of claim 23 wherein the natural 
language processing rules are rules selected from the list consisting of morphological 
rules, semantic rules, and syntactic rules. 

25. (Currently Amended) The method of claim 6, wherein filtering
comprises: 

excluding a phrase from the set of frequently occurring phrases, wherein the
phrase comprises a sub-phrase that occurs at a higher frequency than the phrase.

26. (Currently Amended) The machine-readable medium of claim 29
wherein filtering comprises: 

excluding a phrase from the set of frequently occurring phrases, wherein the
phrase comprises a sub-phrase that occurs at a higher frequency than the phrase. 

27. (Currently Amended) The system of claim 22 wherein filtering
comprises: 

excluding a phrase from the set of frequently occurring phrases, wherein the 
phrase comprises a sub-phrase that occurs at a higher frequency than the phrase. 
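
As an illustrative aside that is not part of the claim language, the exclusion recited in claims 25 to 27 can be sketched as below. This is a sketch under assumptions: phrases are represented as tuples of words mapped to occurrence counts, a sub-phrase is taken to mean any contiguous, strictly shorter run of words, and the helper names `proper_subphrases` and `drop_if_subphrase_more_frequent` are hypothetical.

```python
from typing import Dict, Iterator, Tuple

Phrase = Tuple[str, ...]


def proper_subphrases(phrase: Phrase) -> Iterator[Phrase]:
    """Yield every contiguous sub-phrase strictly shorter than phrase."""
    n = len(phrase)
    for length in range(1, n):
        for start in range(n - length + 1):
            yield phrase[start:start + length]


def drop_if_subphrase_more_frequent(freq: Dict[Phrase, int]) -> Dict[Phrase, int]:
    """Keep a phrase only if no recorded sub-phrase is strictly more frequent.

    Sub-phrases missing from freq are treated as having a count of 0,
    which is an assumption of this sketch.
    """
    return {
        p: c for p, c in freq.items()
        if not any(freq.get(s, 0) > c for s in proper_subphrases(p))
    }
```

With counts of 5 for "acme" and 2 for "acme widget", the longer phrase is excluded because its sub-phrase "acme" occurs more often; a phrase whose sub-phrases all occur exactly as often as it does is kept.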

28. (Currently Amended) The method of claim 6, wherein the filtering
comprises: 

excluding an embedded phrase from the set of frequently occurring phrases, 
wherein the embedded phrase is embedded by an embedding phrase that occurs at a 
similar frequency with the embedded phrase. 

29. (Currently Amended) A machine-readable medium containing instructions which,
when executed by a processor, cause the processor to perform a method, the method
comprising:

creating a suffix tree to determine the frequency of phrases within a text corpus;

specifying a set of frequently occurring phrases; and

filtering the set of frequently occurring phrases to determine a set of frequently
occurring and unrecognized phrases as entity name and jargon term candidates,

wherein the filtering comprises[[:]] excluding an embedded phrase from the set 
of frequently occurring phrases, wherein the embedded phrase is embedded by an 
embedding phrase that occurs at a similar frequency with the embedded phrase. 

30. (Currently Amended) The system of claim 22 wherein the filtering
comprises: 

excluding an embedded phrase from the set of frequently occurring phrases, 
wherein the embedded phrase is embedded by an embedding phrase that occurs at a 
similar frequency with the embedded phrase. 
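
As an illustrative aside that is not part of the claim language, the embedded-phrase exclusion recited in claims 28 to 30 can be sketched as below. The claims do not define "similar frequency", so this sketch assumes a relative difference of at most `tolerance`; that parameter and the names `contains` and `drop_embedded` are hypothetical.

```python
from typing import Dict, Tuple

Phrase = Tuple[str, ...]


def contains(outer: Phrase, inner: Phrase) -> bool:
    """True if inner is a contiguous, strictly shorter run inside outer."""
    k = len(inner)
    return k < len(outer) and any(
        outer[i:i + k] == inner for i in range(len(outer) - k + 1)
    )


def drop_embedded(freq: Dict[Phrase, int], tolerance: float = 0.1) -> Dict[Phrase, int]:
    """Drop a phrase when a longer phrase embeds it at a similar frequency.

    "Similar" is modeled as |a - b| <= tolerance * max(a, b), an assumed
    definition chosen only for illustration.
    """
    def similar(a: int, b: int) -> bool:
        return abs(a - b) <= tolerance * max(a, b)

    return {
        p: c for p, c in freq.items()
        if not any(contains(q, p) and similar(c, d) for q, d in freq.items())
    }
```

The intuition is that when an embedded phrase almost never appears outside its embedding phrase, the longer phrase is the better candidate. For instance, if "hong", "kong", and "hong kong" each occur 10 times, only "hong kong" survives.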